5G-and-beyond mobile networks will support heterogeneous use cases at an unprecedented scale, demanding automated control and optimization of network functions customized to the needs of individual users. Such fine-grained control of the Radio Access Network (RAN) is not possible with the current cellular architecture. To fill this gap, the Open RAN paradigm and its specifications introduce an open architecture with abstractions that enable closed-loop control and provide data-driven, intelligent optimization of the RAN at the user level. This is obtained through custom RAN control applications (i.e., xApps) deployed on near-real-time RAN Intelligent Controllers (near-RT RICs) at the network edge. Despite these premises, as of today the research community lacks a sandbox for building data-driven xApps and for creating large-scale datasets for effective AI training. In this paper, we address this problem by introducing ns-O-RAN, a software framework that integrates a real-world, production-grade near-RT RIC with a 3GPP-based simulated environment on ns-3, enabling the development of xApps, automated large-scale data collection, and testing of deep reinforcement learning-driven control policies for optimization at the user level. In addition, we propose the first user-specific O-RAN Traffic Steering (TS) intelligent handover framework. It uses Random Ensemble Mixture combined with a state-of-the-art convolutional neural network architecture to optimally assign a serving base station to each user in the network. Our TS xApp, trained with more than 40 million data points collected by ns-O-RAN, runs on the near-RT RIC and controls its base stations. We evaluate its performance in large-scale deployments, showing that xApp-based handover improves throughput and spectral efficiency by an average of 50% over traditional handover heuristics, with less mobility overhead.
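The traffic-steering decision described above can be reduced to a per-user argmax over candidate cells. The toy sketch below assumes each (user, base station) pair already has a score (in the paper these come from an ensemble of CNN value estimates; here they are made-up numbers), and all identifiers are illustrative assumptions, not the paper's API.

```python
# Hypothetical sketch: hand each user over to its highest-scoring cell.
# Scores stand in for the CNN-ensemble estimates described above.

def steer_users(scores):
    """Map each user to the base station with the highest score."""
    return {user: max(cells, key=cells.get) for user, cells in scores.items()}

# Toy per-user scores over three candidate gNBs (illustrative values).
scores = {
    "ue1": {"gnb1": 0.7, "gnb2": 0.2, "gnb3": 0.1},
    "ue2": {"gnb1": 0.1, "gnb2": 0.3, "gnb3": 0.6},
}
assignment = steer_users(scores)
print(assignment)
```

In the actual system this decision loop would run inside an xApp on the near-RT RIC, with scores refreshed as new RAN telemetry arrives.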
Softwarization, programmable network control, and the use of all-encompassing controllers acting at different timescales are heralded as key drivers of the evolution of next-generation cellular networks. These technologies have fostered newly designed intelligent, data-driven solutions for managing large sets of diverse cellular functions, which would be basically impossible to implement in traditionally closed cellular architectures. Despite the evident industrial interest in Artificial Intelligence (AI) and Machine Learning (ML) solutions for closed-loop control of the Radio Access Network (RAN), and several research works in the field, their design is far from mainstream and remains a sophisticated, often overlooked operation. In this paper, we discuss how to design AI/ML solutions for the intelligent closed-loop control of Open RANs, providing guidelines and insights based on exemplary solutions with a high-performance record. We then show how to instantiate these solutions on the O-RAN near-real-time RAN Intelligent Controller (RIC) through OpenRAN Gym, the first publicly available toolbox for data-driven O-RAN experimentation. We showcase a use case of an xApp developed with OpenRAN Gym, tested on a cellular network with 7 base stations and 42 users deployed on the Colosseum wireless network emulator. Our demonstration shows the high degree of flexibility of the OpenRAN Gym-based xApp development environment, which is independent of deployment scenarios and traffic demand.
Open Radio Access Network (RAN) architectures will enable interoperability, openness, and programmable data-driven control in next-generation cellular networks. However, developing and testing efficient solutions that generalize across heterogeneous cellular deployments and scales, and that optimize network performance in such diverse environments, is a complex task that is still largely unexplored. In this paper, we present OpenRAN Gym, a unified, open, and O-RAN-compliant experimental toolbox for data collection, design, prototyping, and testing of end-to-end data-driven control solutions for next-generation Open RAN systems. OpenRAN Gym extends and combines into a unique solution several software frameworks for the collection of RAN statistics and for RAN control, together with a lightweight O-RAN near-real-time RAN Intelligent Controller (RIC) tailored to run on experimental wireless platforms. We first provide an overview of the architectural components of OpenRAN Gym and describe how it is used to collect data at scale and to design, train, and test Artificial Intelligence and Machine Learning O-RAN-compliant applications (xApps). We then describe in detail how to test the developed xApps on softwarized RANs, and provide an example of two xApps developed with OpenRAN Gym that are used to control a network with 7 base stations and 42 users deployed on the Colosseum testbed. Finally, we show how solutions developed with OpenRAN Gym on Colosseum can be exported to real-world, heterogeneous wireless platforms, such as the Arena testbed and the POWDER and COSMOS platforms of the PAWR program. OpenRAN Gym and its software components are open source and publicly available to the research community.
Despite the new opportunities brought about by the Open RAN, advances in ML-based network automation have been slow, mainly because of the unavailability of large-scale datasets and experimental testing infrastructure. This has slowed down the development and widespread adoption of Deep Reinforcement Learning (DRL) agents on real networks, delaying progress in intelligent and autonomous RAN control. In this paper, we address these challenges by proposing practical solutions and software pipelines for the design, training, testing, and experimental evaluation of DRL-based closed-loop control in the Open RAN. We introduce ColO-RAN, the first publicly available, large-scale O-RAN testing framework with software-defined radios in the loop. Building on the scale and computational capabilities of the Colosseum wireless network emulator, ColO-RAN enables ML research at scale using O-RAN components, programmable base stations, and a "wireless data factory." Specifically, we design and develop three exemplary xApps for DRL-based RAN slicing, scheduling, and online model training, and evaluate their performance on a cellular network with 7 softwarized base stations and 42 users. Finally, we demonstrate the portability of ColO-RAN to different platforms by deploying it on Arena, an indoor programmable testbed. The extensive results of our first-of-its-kind large-scale evaluation highlight the benefits and challenges of DRL-based adaptive control. They also provide insights on the development of wireless DRL pipelines, from data analysis to the design of DRL agents, and on the tradeoffs associated with training on a live network. ColO-RAN and the collected large-scale dataset will be made publicly available to the research community.
Colosseum is an open and publicly available large-scale wireless testbed for experimental research via virtualized and softwarized waveforms and protocol stacks on a fully programmable "white-box" platform. With 256 state-of-the-art software-defined radios and a massive channel emulator core, Colosseum can model virtually any scenario, enabling the design, development, and testing of solutions at scale under a variety of deployments and channel conditions. These Colosseum radio-frequency scenarios are reproduced through high-fidelity FPGA-based emulation with finite impulse response filters. The filters model the taps of the desired wireless channels and apply them to the signals generated by the radio nodes, faithfully mimicking the conditions of real-world wireless environments. In this paper, we introduce Colosseum as a testbed that is for the first time open to the research community. We describe the architecture of Colosseum and its experimentation and emulation capabilities. We then demonstrate the effectiveness of Colosseum for experimental research at scale through exemplary use cases, including prevailing wireless technologies (e.g., cellular and Wi-Fi) in spectrum-sharing and unmanned-aerial-vehicle scenarios. A roadmap for future updates of Colosseum concludes the paper.
An Anomaly Detection (AD) system for self-diagnosis has been developed for a Multiphase Flow Meter (MPFM). The system relies on machine learning algorithms for time-series forecasting: historical data are used to train a model that predicts the expected behavior of a sensor, so that deviations from the prediction can be flagged as anomalies.
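A minimal sketch of forecast-residual anomaly detection in this spirit: a forecaster (here a plain moving average standing in for the trained ML model) predicts the next sensor reading, a residual threshold is calibrated on well-behaved historical data, and test points whose residual exceeds it are flagged. The moving-average model, the `k`-times-mean-residual threshold, and all numbers are illustrative assumptions, not details from the abstract.

```python
import statistics

def moving_average_forecast(history, window=5):
    """Predict the next value as the mean of the last `window` samples."""
    return sum(history[-window:]) / window

def detect_anomalies(train, test, window=5, k=3.0):
    """Calibrate a threshold on `train` (k times the mean absolute
    forecast residual), then flag test indices whose residual exceeds it."""
    resids = [abs(train[i] - moving_average_forecast(train[:i], window))
              for i in range(window, len(train))]
    threshold = k * statistics.mean(resids)
    series, flagged = list(train), []
    for i, x in enumerate(test):
        if abs(x - moving_average_forecast(series, window)) > threshold:
            flagged.append(i)
        series.append(x)
    return flagged

# Calibrate on a well-behaved sensor trace, then flag a spike at index 2.
train = [1.0, 1.1] * 10
flagged = detect_anomalies(train, [1.0, 1.1, 10.0, 1.0, 1.1])
print(flagged)
```

Note that the indices right after the spike may also be flagged, since the spike contaminates the moving-average forecast; a production forecaster would typically exclude flagged points from its history.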
Diffusion models have achieved justifiable popularity by attaining state-of-the-art performance in generating realistic objects from seemingly arbitrarily complex data distributions, including when conditioning generation on labels. Unfortunately, however, their iterative nature renders them very computationally inefficient during the sampling process. For the multi-class conditional generation problem, we propose a novel, structurally unique framework of diffusion models which are hierarchically branched according to the inherent relationships between classes. In this work, we demonstrate that branched diffusion models offer major improvements in efficiently generating samples from multiple classes. We also showcase several other advantages of branched diffusion models, including ease of extension to novel classes in a continual-learning setting, and a unique interpretability that offers insight into these generative models. Branched diffusion models represent an alternative paradigm to their traditional linear counterparts, and can have large impacts in how we use diffusion models for efficient generation, online learning, and scientific discovery.
In the era of digital healthcare, the huge volumes of textual information generated every day in hospitals constitute an essential but underused asset that could be exploited with task-specific, fine-tuned biomedical language representation models, improving patient care and management. For such specialized domains, previous research has shown that models stemming from broad-coverage checkpoints can benefit greatly from additional training rounds over large-scale in-domain resources. However, these resources are often unreachable for less-resourced languages like Italian, preventing local medical institutions from employing in-domain adaptation. In order to reduce this gap, our work investigates two accessible approaches to derive biomedical language models in languages other than English, taking Italian as a concrete use case: one based on neural machine translation of English resources, favoring quantity over quality; the other based on a high-grade, narrow-scoped corpus natively written in Italian, thus preferring quality over quantity. Our study shows that data quantity is a harder constraint than data quality for biomedical adaptation, but that concatenating high-quality data can improve model performance even with relatively size-limited corpora. The models published from our investigations have the potential to unlock important research opportunities for Italian hospitals and academia. Finally, the lessons learned from this study constitute valuable insights towards building biomedical language models that generalize to other less-resourced languages and different domain settings.
Crop type maps are critical for tracking agricultural land use and estimating crop production. Remote sensing has proven an efficient and reliable tool for creating these maps in regions with abundant ground labels for model training, yet these labels remain difficult to obtain in many regions and years. NASA's Global Ecosystem Dynamics Investigation (GEDI) spaceborne lidar instrument, originally designed for forest monitoring, has shown promise for distinguishing tall and short crops. In the current study, we leverage GEDI to develop wall-to-wall maps of short vs tall crops on a global scale at 10 m resolution for 2019-2021. Specifically, we show that (1) GEDI returns can reliably be classified into tall and short crops after removing shots with extreme view angles or topographic slope, (2) the frequency of tall crops over time can be used to identify months when tall crops are at their peak height, and (3) GEDI shots in these months can then be used to train random forest models that use Sentinel-2 time series to accurately predict short vs. tall crops. Independent reference data from around the world are then used to evaluate these GEDI-S2 maps. We find that GEDI-S2 performed nearly as well as models trained on thousands of local reference training points, with accuracies of at least 87% and often above 90% throughout the Americas, Europe, and East Asia. Systematic underestimation of tall crop area was observed in regions where crops frequently exhibit low biomass, namely Africa and South Asia, and further work is needed in these systems. Although the GEDI-S2 approach only differentiates tall from short crops, in many landscapes this distinction goes a long way toward mapping the main individual crop types. The combination of GEDI and Sentinel-2 thus presents a very promising path towards global crop mapping with minimal reliance on ground data.
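Step (2) above, identifying the months when tall crops are at peak height from the frequency of tall-crop classifications, can be sketched as a simple selection rule. The monthly fractions and the within-80%-of-peak rule below are illustrative assumptions, not the paper's actual procedure or data.

```python
# Hypothetical sketch: from the monthly fraction of GEDI shots
# classified as "tall crop", pick months near the seasonal peak
# as candidate months for harvesting training labels.

def peak_tall_months(monthly_tall_fraction, rel_threshold=0.8):
    """Return months whose tall-crop fraction is within
    `rel_threshold` of the maximum observed fraction."""
    peak = max(monthly_tall_fraction.values())
    return sorted(m for m, f in monthly_tall_fraction.items()
                  if f >= rel_threshold * peak)

# Toy monthly fractions for a northern-hemisphere maize region.
fractions = {4: 0.05, 5: 0.10, 6: 0.40, 7: 0.85, 8: 0.90, 9: 0.75, 10: 0.30}
months = peak_tall_months(fractions)
print(months)  # → [7, 8, 9]
```

GEDI shots from the selected months would then serve as labels for a classifier (the paper uses random forests on Sentinel-2 time series) to produce wall-to-wall tall vs. short crop maps.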
Algorithms and technologies are essential tools that pervade all aspects of our daily lives. In the last decades, health care research has benefited from new computer-based recruiting methods, the use of federated architectures for data storage, the introduction of innovative analyses of datasets, and so on. Nevertheless, health care datasets can still be affected by data bias. Due to data bias, they provide a distorted view of reality, leading to wrong analysis results and, consequently, wrong decisions. For example, in a clinical trial that studied the risk of cardiovascular diseases, predictions were wrong due to the lack of data on ethnic minorities. It is therefore of paramount importance for researchers to acknowledge the data bias that may be present in the datasets they use, adopt techniques to mitigate it where possible, and check whether and how the analysis results are affected. This paper proposes a method to address bias in datasets that: (i) defines the types of data bias that may be present in the dataset, (ii) characterizes and quantifies data bias with adequate metrics, and (iii) provides guidelines to identify, measure, and mitigate data bias for different data sources. The proposed method is applicable to both prospective and retrospective clinical trials. We evaluate our proposal both through theoretical considerations and through interviews with researchers in the health care environment.
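Step (ii), quantifying data bias with a metric, can be illustrated with representation bias: comparing each group's share of a clinical dataset against its share of the reference population. The ratio metric and the 0.8 "fair range" bound below (loosely inspired by the four-fifths rule) are assumptions for illustration, not the paper's own definitions.

```python
# Hypothetical sketch: flag groups that are under-represented in a
# dataset relative to a reference population.

def representation_ratios(dataset_counts, population_shares):
    """Ratio of each group's dataset share to its population share.
    A ratio well below 1 signals under-representation."""
    total = sum(dataset_counts.values())
    return {g: (dataset_counts[g] / total) / population_shares[g]
            for g in dataset_counts}

def underrepresented(ratios, lower=0.8):
    """Groups whose representation ratio falls below `lower`."""
    return sorted(g for g, r in ratios.items() if r < lower)

# Toy trial cohort vs. reference population shares (illustrative).
counts = {"group_a": 800, "group_b": 150, "group_c": 50}
shares = {"group_a": 0.60, "group_b": 0.25, "group_c": 0.15}
ratios = representation_ratios(counts, shares)
flagged_groups = underrepresented(ratios)
print(flagged_groups)  # → ['group_b', 'group_c']
```

A metric like this supports both prospective use (adjusting recruitment while a trial runs) and retrospective use (reweighting or caveating analyses on an existing dataset).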